{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# Returning sources\n",
        "\n",
        "Often in Q&A applications it’s important to show users the sources that were used to generate the answer. The simplest way to do this is for the chain to return the Documents that were retrieved in each generation.\n",
        "\n",
        "We’ll work off of the Q&A app we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [Quickstart](/docs/use_cases/question_answering/quickstart)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Setup\n",
        "### Dependencies\n",
        "\n",
        "We’ll use an OpenAI chat model and embeddings and a Memory vector store in this walkthrough, but everything shown here works with any [ChatModel](/docs/modules/model_io/chat) or [LLM](/docs/modules/model_io/llms), [Embeddings](/docs/modules/data_connection/text_embedding/), and [VectorStore](/docs/modules/data_connection/vectorstores/) or [Retriever](/docs/modules/data_connection/retrievers/).\n",
        "\n",
        "We’ll use the following packages:\n",
        "\n",
        "```bash\n",
        "npm install --save langchain @langchain/openai cheerio\n",
        "```\n",
        "\n",
        "We need to set environment variable `OPENAI_API_KEY`:\n",
        "\n",
        "```bash\n",
        "export OPENAI_API_KEY=YOUR_KEY\n",
        "```\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### LangSmith\n",
        "\n",
        "Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with [LangSmith](https://smith.langchain.com/).\n",
        "\n",
        "Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, after you sign up at the link above, make sure to set your environment variables to start logging traces:\n",
        "\n",
        "\n",
        "```bash\n",
        "export LANGCHAIN_TRACING_V2=true\n",
        "export LANGCHAIN_API_KEY=YOUR_KEY\n",
        "```"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Chain without sources\n",
        "\n",
        "Here is the Q&A app we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [Quickstart](/docs/use_cases/question_answering/quickstart)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import \"cheerio\";\n",
        "import { CheerioWebBaseLoader } from \"langchain/document_loaders/web/cheerio\";\n",
        "import { RecursiveCharacterTextSplitter } from \"langchain/text_splitter\";\n",
        "import { MemoryVectorStore } from \"langchain/vectorstores/memory\"\n",
        "import { OpenAIEmbeddings, ChatOpenAI } from \"@langchain/openai\";\n",
        "import { pull } from \"langchain/hub\";\n",
        "import { ChatPromptTemplate } from \"@langchain/core/prompts\";\n",
        "import { formatDocumentsAsString } from \"langchain/util/document\";\n",
        "import { RunnableSequence, RunnablePassthrough } from \"@langchain/core/runnables\";\n",
        "import { StringOutputParser } from \"@langchain/core/output_parsers\";"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {},
      "outputs": [],
      "source": [
        "const loader = new CheerioWebBaseLoader(\n",
        "  \"https://lilianweng.github.io/posts/2023-06-23-agent/\"\n",
        ");\n",
        "\n",
        "const docs = await loader.load();\n",
        "\n",
        "const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000, chunkOverlap: 200 });\n",
        "const splits = await textSplitter.splitDocuments(docs);\n",
        "const vectorStore = await MemoryVectorStore.fromDocuments(splits, new OpenAIEmbeddings());\n",
        "\n",
        "// Retrieve and generate using the relevant snippets of the blog.\n",
        "const retriever = vectorStore.asRetriever();\n",
        "const prompt = await pull<ChatPromptTemplate>(\"rlm/rag-prompt\");\n",
        "const llm = new ChatOpenAI({ modelName: \"gpt-3.5-turbo\", temperature: 0 });\n",
        "\n",
        "const ragChain = RunnableSequence.from([\n",
        "  {\n",
        "    context: retriever.pipe(formatDocumentsAsString),\n",
        "    question: new RunnablePassthrough(),\n",
        "  },\n",
        "  prompt,\n",
        "  llm,\n",
        "  new StringOutputParser()\n",
        "]);"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "\u001b[32m\"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. I\"\u001b[39m... 208 more characters"
            ]
          },
          "execution_count": 11,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "await ragChain.invoke(\"What is task decomposition?\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Adding sources\n",
        "\n",
        "With LCEL it’s easy to return the retrieved documents:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {},
      "outputs": [
        {
          "data": {
            "text/plain": [
              "{\n",
              "  question: \u001b[32m\"What is Task Decomposition\"\u001b[39m,\n",
              "  context: [\n",
              "    Document {\n",
              "      pageContent: \u001b[32m\"Fig. 1. Overview of a LLM-powered autonomous agent system.\\n\"\u001b[39m +\n",
              "        \u001b[32m\"Component One: Planning#\\n\"\u001b[39m +\n",
              "        \u001b[32m\"A complicated ta\"\u001b[39m... 898 more characters,\n",
              "      metadata: {\n",
              "        source: \u001b[32m\"https://lilianweng.github.io/posts/2023-06-23-agent/\"\u001b[39m,\n",
              "        loc: { lines: \u001b[36m[Object]\u001b[39m }\n",
              "      }\n",
              "    },\n",
              "    Document {\n",
              "      pageContent: \u001b[32m'Task decomposition can be done (1) by LLM with simple prompting like \"Steps for XYZ.\\\\n1.\", \"What are'\u001b[39m... 887 more characters,\n",
              "      metadata: {\n",
              "        source: \u001b[32m\"https://lilianweng.github.io/posts/2023-06-23-agent/\"\u001b[39m,\n",
              "        loc: { lines: \u001b[36m[Object]\u001b[39m }\n",
              "      }\n",
              "    },\n",
              "    Document {\n",
              "      pageContent: \u001b[32m\"Agent System Overview\\n\"\u001b[39m +\n",
              "        \u001b[32m\"                \\n\"\u001b[39m +\n",
              "        \u001b[32m\"                    Component One: Planning\\n\"\u001b[39m +\n",
              "        \u001b[32m\"                 \"\u001b[39m... 850 more characters,\n",
              "      metadata: {\n",
              "        source: \u001b[32m\"https://lilianweng.github.io/posts/2023-06-23-agent/\"\u001b[39m,\n",
              "        loc: { lines: \u001b[36m[Object]\u001b[39m }\n",
              "      }\n",
              "    },\n",
              "    Document {\n",
              "      pageContent: \u001b[32m\"Resources:\\n\"\u001b[39m +\n",
              "        \u001b[32m\"1. Internet access for searches and information gathering.\\n\"\u001b[39m +\n",
              "        \u001b[32m\"2. Long Term memory management\"\u001b[39m... 456 more characters,\n",
              "      metadata: {\n",
              "        source: \u001b[32m\"https://lilianweng.github.io/posts/2023-06-23-agent/\"\u001b[39m,\n",
              "        loc: { lines: \u001b[36m[Object]\u001b[39m }\n",
              "      }\n",
              "    }\n",
              "  ],\n",
              "  answer: \u001b[32m\"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. I\"\u001b[39m... 256 more characters\n",
              "}"
            ]
          },
          "execution_count": 12,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import { RunnableMap, RunnablePassthrough, RunnableSequence } from \"@langchain/core/runnables\";\n",
        "import { formatDocumentsAsString } from \"langchain/util/document\";\n",
        "\n",
        "const ragChainFromDocs = RunnableSequence.from([\n",
        "  RunnablePassthrough.assign({ context: (input) => formatDocumentsAsString(input.context) }),\n",
        "  prompt,\n",
        "  llm,\n",
        "  new StringOutputParser()\n",
        "]);\n",
        "\n",
        "let ragChainWithSource = new RunnableMap({ steps: { context: retriever, question: new RunnablePassthrough() }})\n",
        "ragChainWithSource = ragChainWithSource.assign({ answer: ragChainFromDocs });\n",
        "\n",
        "await ragChainWithSource.invoke(\"What is Task Decomposition\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Check out the [LangSmith trace](https://smith.langchain.com/public/f07e78b6-cafc-41fd-af54-892c92263b09/r)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Deno",
      "language": "typescript",
      "name": "deno"
    },
    "language_info": {
      "file_extension": ".ts",
      "mimetype": "text/x.typescript",
      "name": "typescript",
      "nb_converter": "script",
      "pygments_lexer": "typescript",
      "version": "5.3.3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
